Dependent default and recovery: MCMC 
study of downturn LGD credit risk model 



There is empirical evidence that recovery rates tend to go down just
when the number of defaults goes up in economic downturns. This
has to be taken into account when estimating the capital against credit
risk required by Basel II to cover losses during adverse economic
downturns: the so-called "downturn LGD" requirement. This paper
presents estimation of the LGD credit risk model with default and
recovery dependent via the latent systematic risk factor, using a Bayesian
inference approach and the Markov chain Monte Carlo method. This
approach allows joint estimation of all model parameters and the latent
systematic factor, and all relevant uncertainties. Results using Moody's
annual default and recovery rates for corporate bonds for the period
1982-2010 show that the impact of parameter uncertainty on economic
capital can be very significant and should be assessed by practitioners.

1 Introduction 

Default and recovery rates are key components of Loss Given Default (LGD)
models used in some banks for calculation of economic capital (EC) against
credit risk. The classic LGD model implicitly assumes that the default rates
and recovery rates are independent. Motivated by empirical evidence that
recovery rates tend to go down just when the number of defaults goes up
in economic downturns, Frye [3], Pykhtin [9] and Düllmann and Trapp [2]
extended the classic model to include dependence between default and
recovery via a common systematic factor. These models have been suggested by
some banks for assessment of the Basel II "downturn LGD" requirement pQ. 
The Basel II "downturn LGD" reasoning is that recovery rates may be lower
during economic downturns when default rates are high, and that capital
should be sufficient to cover losses during these adverse circumstances. The
extended models represent an important enhancement of credit risk models 
used in earlier practice, such as CreditMetrics and CreditRisk+, that do not 
account for dependence between default and recovery. 

Publicly available data provided by Moody's or Standard & Poor's rating
agencies are annual averages of defaults and recoveries. These data are of
limited size, covering a couple of decades at most. As will be shown in this
paper, due to the limited data size the impact of parameter uncertainty
on the capital estimate can be very significant. None of the various studies,
including the extension works [3], [9], [2], specifically addressed the quantitative
impact of parameter uncertainty. Increasingly, quantification of parameter
uncertainty and its impact on EC has become a key component of financial
risk modeling and management; for recent examples in operational risk and
insurance, see [5], [8]. This paper studies parameter uncertainty and its impact
on the EC estimate in the LGD model, where default and recovery are dependent
via the latent systematic risk factor. We demonstrate how the model can be
estimated using a Bayesian approach and the Markov chain Monte Carlo (MCMC)
method. This approach allows joint estimation of all model parameters and
the latent systematic factor, and all relevant uncertainties.



Following [2], [3], [9], consider a homogeneous portfolio of $J$ borrowers over a
chosen time horizon. To avoid cumbersome notation, we assume that the $j$th
borrower has one loan with principal amount $A_j$. The loss rate (loss amount
relative to the loan amount) of the portfolio due to defaults is

$$L = \sum_{j=1}^{J} w_j I_j \max(1 - R_j, 0), \qquad (1)$$

where $w_j$ is the weight of loan $j$, $w_j = A_j / \sum_{m=1}^{J} A_m$; $\max(1 - R_j, 0)$ is the loss rate of
loan $j$ due to potential default; $1 - \max(1 - R_j, 0)$ is the recovery rate of loan
$j$ after default; $I_j$ is an indicator variable associated with the default of loan
$j$, $I_j = 1$ if firm $j$ defaults, otherwise $I_j = 0$. In general $R_j$ is not the same
as the recovery rate since the latter is subject to a cap of 1.

Denote the probability of default for firm $j$ by $p$, i.e. $\Pr[I_j = 1] = p$. Let
$C_j$ be an underlying latent random variable (financial well-being) such that
firm $j$ defaults if $C_j < \Phi^{-1}(p)$, where $\Phi(\cdot)$ is the standard normal distribution
and $\Phi^{-1}(\cdot)$ is its inverse. That is, $I_j = 1$ if $C_j < \Phi^{-1}(p)$ and $I_j = 0$ otherwise.
The value $C_j$ for each firm depends on a systematic risk factor $X$ and a firm
specific (idiosyncratic) risk factor $Z_j^C$ as

$$C_j = \sqrt{\rho}\, X + \sqrt{1 - \rho}\, Z_j^C, \qquad (2)$$

where $Z_1^C, \ldots, Z_J^C$ are all independent. Also, $X$ and $Z_j^C$ are assumed
independent and from the standard normal distribution. Conditional on $X$, the
financial conditions of any two firms are independent. Unconditionally, $\rho$ is
the correlation between the financial conditions of two firms.

The studies [2], [3], [9] considered normal, lognormal and logit-normal
distributions for the recovery. It was shown in [2] that EC estimates from these
three recovery models are very close to each other; the difference is within 2%.
In addition, statistical tests favored the normal distribution model. Thus we
model the recovery rate as

$$R_j = \mu + \sigma\sqrt{\omega}\, X + \sigma\sqrt{1 - \omega}\, Z_j, \quad \omega \in [0, 1], \qquad (3)$$

where $X$ and $Z_j$ are assumed independent and from the standard normal
distribution. Also, $Z_j$ and $Z_j^C$ are assumed independent. The recovery
and default processes are dependent via the systematic factor $X$.

3 Economic Capital 

It is common to define the EC for credit risk as a high quantile of the
distribution of loss $L$, i.e.

$$Q_q(\theta) = Q_q = \inf\{z : \Pr[L > z \mid \theta] \le 1 - q\} = \inf\{z : F_L(z \mid \theta) \ge q\}, \qquad (4)$$

where $q$ is a quantile level; $F_L(z \mid \theta)$ is the distribution function of the loss $L$ with
the density denoted as $f_L(z \mid \theta)$; and $\theta = (p, \rho, \mu, \sigma, \omega)$ are model parameters.

The EC measured by the quantile $Q_q(\theta)$ is a function of $\theta$. Typically,
given observations, the maximum likelihood estimators (MLEs) $\hat{\theta}$ are used
as point estimates for $\theta$. Then, the loss density for the next time period is
estimated as $f_L(z \mid \hat{\theta})$ and its quantile, $Q_q(\hat{\theta})$, is used for EC calculation. The
distribution of $L$ is not tractable in closed form for an arbitrary portfolio. In
this case the Monte Carlo method can be used with the following logical steps.

Algorithm 1 (Quantile given parameters)

1. Draw an independent sample from $\Phi(\cdot)$ for the systematic factor $X$.

2. For each $j$, draw $Z_j^C$ from $\Phi(\cdot)$; calculate $C_j$ and $I_j$.

3. For each $j$, draw $Z_j$ from $\Phi(\cdot)$ and find $R_j = \mu + \sigma\sqrt{\omega}\, X + \sigma\sqrt{1 - \omega}\, Z_j$.

4. Find loss $L$ for the entire portfolio using (1), i.e. a sample from $F_L(\cdot \mid \theta)$.

5. Repeat steps 1-4 to obtain $N$ samples of $L$.

6. Estimate $Q_q(\theta)$ using the obtained samples of $L$ in the standard way.
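
Algorithm 1 can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the implementation used for the results in this paper: the function names `simulate_loss` and `loss_quantile` are hypothetical, the portfolio is assumed to be equally weighted ($w_j = 1/J$), and the recovery factor is drawn only for defaulted loans, which leaves the distribution of $L$ unchanged because non-defaulted loans contribute zero loss.

```python
import math
import random
from statistics import NormalDist

PHI = NormalDist()  # standard normal: PHI.cdf is Phi, PHI.inv_cdf is its inverse


def simulate_loss(p, rho, mu, sigma, omega, weights, rng):
    """Steps 1-4 of Algorithm 1: one draw of the portfolio loss rate L."""
    threshold = PHI.inv_cdf(p)                 # firm defaults if C_j < Phi^{-1}(p)
    x = rng.gauss(0.0, 1.0)                    # step 1: systematic factor X
    loss = 0.0
    for w in weights:
        z_default = rng.gauss(0.0, 1.0)        # step 2: idiosyncratic default factor
        c = math.sqrt(rho) * x + math.sqrt(1.0 - rho) * z_default   # equation (2)
        if c < threshold:                      # I_j = 1
            z_recovery = rng.gauss(0.0, 1.0)   # step 3: idiosyncratic recovery factor
            r = (mu + sigma * math.sqrt(omega) * x
                 + sigma * math.sqrt(1.0 - omega) * z_recovery)     # equation (3)
            loss += w * max(1.0 - r, 0.0)      # step 4: contribution to (1)
    return loss


def loss_quantile(q, n_samples, p, rho, mu, sigma, omega, n_borrowers, seed=0):
    """Steps 5-6: empirical q-quantile of L from n_samples simulated losses."""
    rng = random.Random(seed)
    weights = [1.0 / n_borrowers] * n_borrowers   # assumption: equal loan sizes
    losses = sorted(simulate_loss(p, rho, mu, sigma, omega, weights, rng)
                    for _ in range(n_samples))
    # smallest sample value whose empirical distribution function is >= q
    index = max(0, math.ceil(q * n_samples) - 1)
    return losses[index]
```

The Monte Carlo error at a level such as $q = 0.999$ can be substantial unless the number of samples is large, since only about one sample in a thousand falls beyond the quantile.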

Bank loans are subject to the borrower specific risk and systematic risk.
In the case of a diversified portfolio with a large number of borrowers, the
idiosyncratic risk can be eliminated and the loss depends on $X$ only. Gordy [I]
has shown that the distribution of portfolio loss $L$ has a limiting form as
$J \to \infty$, provided that each weight $w_j$ goes to zero faster than $1/\sqrt{J}$. The
limiting loss rate $L^\infty$ is given by the expected loss rate conditional on $X$

$$L^\infty = L^\infty(X) = \sum_{j=1}^{J} w_j\, E[I_j \mid X]\, E[\max(1 - R_j, 0) \mid X] = \Lambda(X) S(X), \qquad (5)$$

where $\Lambda(X) = E[I_j \mid X]$ is the conditional probability of default of firm $j$
and $S(X) = E[\max(1 - R_j, 0) \mid X]$ is the conditional expected value of the loss
rate. That is, the distribution of $L^\infty$ is fully implied by the distribution
of $X$. Because $L^\infty(X)$ is a monotonic decreasing function and $X$ is from
the standard normal distribution, the quantile of $L^\infty(X)$ at level $q$ can be
calculated as $Q_q^\infty = L^\infty(X = \Phi^{-1}(1 - q))$. As in 0, we define EC of the
diversified portfolio loss distribution $L^\infty(X)$ as the 0.999 quantile

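$$EC^\infty = Q_{0.999}^\infty = L^\infty(\Phi^{-1}(0.001)) = PD \times LGD, \qquad (6)$$

where $PD = \Lambda(\Phi^{-1}(0.001))$ and $LGD = S(\Phi^{-1}(0.001))$ are the stressed
probability of default (stressed PD) and stressed loss given default (stressed LGD)
respectively. Using (2), the conditional probability of default is

$$\Lambda(X) = \Phi\left(\frac{\Phi^{-1}(p) - \sqrt{\rho}\, X}{\sqrt{1 - \rho}}\right). \qquad (7)$$

Also, the expected conditional loss rate for the normally distributed recovery
rate model (3) is easily calculated as

$$S(X) = \int_{-\infty}^{\infty} \max(1 - \mu - \sigma\sqrt{\omega}\, X - \sigma\sqrt{1 - \omega}\, z, 0) f_N(z)\, dz = (1 - \mu - \sigma\sqrt{\omega}\, X)\Phi(z_c) + \sigma\sqrt{1 - \omega}\, \frac{e^{-z_c^2/2}}{\sqrt{2\pi}}, \qquad (8)$$

where $z_c = (1 - \mu - \sigma\sqrt{\omega}\, X)/(\sigma\sqrt{1 - \omega})$ and $f_N(z)$ is the standard normal
density. For the real data used in this study, it can be well approximated as
$S(X) \approx E[(1 - R_j) \mid X] = 1 - \mu - \sigma\sqrt{\omega}\, X$.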
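
In the limiting case the EC in (6) needs no simulation. The following Python sketch evaluates the conditional default probability (7), the conditional loss rate (8) and its linear approximation; the function names are hypothetical, and `statistics.NormalDist` from the standard library supplies $\Phi$, $\Phi^{-1}$ and the standard normal density.

```python
import math
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution


def conditional_pd(x, p, rho):
    """Lambda(x): default probability conditional on the systematic factor, eq. (7)."""
    return PHI.cdf((PHI.inv_cdf(p) - math.sqrt(rho) * x) / math.sqrt(1.0 - rho))


def conditional_loss_rate(x, mu, sigma, omega):
    """S(x) = E[max(1 - R_j, 0) | X = x] for normal recovery, eq. (8); needs omega < 1."""
    a = 1.0 - mu - sigma * math.sqrt(omega) * x
    b = sigma * math.sqrt(1.0 - omega)
    z_c = a / b
    return a * PHI.cdf(z_c) + b * PHI.pdf(z_c)


def conditional_loss_rate_linear(x, mu, sigma, omega):
    """Linear approximation S(x) ~ 1 - mu - sigma * sqrt(omega) * x."""
    return 1.0 - mu - sigma * math.sqrt(omega) * x


def economic_capital(p, rho, mu, sigma, omega, q=0.999, linear=True):
    """Stressed PD, stressed LGD and EC = PD * LGD at quantile level q, eq. (6)."""
    x_q = PHI.inv_cdf(1.0 - q)          # adverse value of the systematic factor
    pd = conditional_pd(x_q, p, rho)
    if linear:
        lgd = conditional_loss_rate_linear(x_q, mu, sigma, omega)
    else:
        lgd = conditional_loss_rate(x_q, mu, sigma, omega)
    return pd, lgd, pd * lgd
```

As a rough check, feeding the MLE column of Table 1 into `economic_capital` with the linear approximation gives a stressed PD of about 0.082 and a stressed LGD of about 0.80, in line with the MLE column of Table 2.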
4 Likelihood

Consider time periods $t = 1, \ldots, T$ (so that $T + 1$ corresponds to the next
future year), where the following data of default and recovery for a loan
portfolio of $J_t$ firms are observed: $D_t$, the number of defaults in year $t$,
with realization $d_t$; $\Psi_t = D_t / J_t$, the default rate in year $t$, with
realization $\psi_t$; and $\bar{R}_t = \sum_{j=1}^{D_t} R_j(t)/D_t$, the average recovery rate in year
$t$, where $R_1(t), \ldots, R_{D_t}(t)$ are individual recoveries, with realization
$\bar{r}_t$. Also, the systematic factor $X$ corresponding to the time periods is
denoted as $X_1, \ldots, X_{T+1}$ and its realization is $x_1, \ldots, x_{T+1}$. It is assumed that
$X_1, \ldots, X_{T+1}$ are independent and all idiosyncratic factors
corresponding to the time periods are all independent.

4.1 Exact Likelihood Function 

The joint density of the number of defaults and average recovery rate $(D_t, \bar{R}_t)$
can be calculated by integrating out the latent variable $X_t$ for each $t$ as

$$f(d_t, \bar{r}_t) = \int f(\bar{r}_t \mid d_t, x_t) f(d_t \mid x_t) f_N(x_t)\, dx_t, \qquad (9)$$

where the conditional densities $f(d_t \mid x_t)$ and $f(\bar{r}_t \mid d_t, x_t)$ are derived as follows.

Given $X_t = x_t$, all firms in a homogeneous portfolio have the same
conditional default probability $\Pr[I_j(t) = 1 \mid X_t = x_t] = \Lambda(x_t)$ evaluated in (7).
Thus, the conditional distribution of $D_t = \sum_{j=1}^{J_t} I_j(t)$ is binomial

$$f(d_t \mid x_t) = \Pr[D_t = d_t \mid X_t = x_t] = \binom{J_t}{d_t} \Lambda(x_t)^{d_t} (1 - \Lambda(x_t))^{J_t - d_t}. \qquad (10)$$

Often it can be well approximated by the normal distribution $N(\mu_t, \sigma_t^2)$ with
mean $\mu_t = J_t \Lambda(x_t)$ and variance $\sigma_t^2 = J_t \Lambda(x_t)(1 - \Lambda(x_t))$.

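Conditional on $X_t = x_t$ and $D_t = d_t$, individual recoveries $R_1(t), \ldots, R_{d_t}(t)$
are independent and from $N(\mu_r, \sigma_r^2)$ with $\mu_r = \mu + \sigma\sqrt{\omega}\, x_t$ and $\sigma_r = \sigma\sqrt{1 - \omega}$.
Thus the average $\bar{R}_t$ is from $N(\mu_R, \sigma_R^2)$ with $\mu_R = \mu_r$ and $\sigma_R^2 = \sigma_r^2/d_t$, i.e.

$$f(\bar{r}_t \mid d_t, x_t) = \frac{1}{\sqrt{2\pi}\, \sigma_R} \exp\left(-\frac{(\bar{r}_t - \mu_R)^2}{2\sigma_R^2}\right). \qquad (11)$$

If the recovery distribution is different from normal, the average $\bar{R}_t$ can still be
approximated by a normal distribution if $d_t$ is large (and the variance is finite).
Define the data vectors $D = (D_1, \ldots, D_T)$ and $\bar{R} = (\bar{R}_1, \ldots, \bar{R}_T)$, then the
joint likelihood function for data $D$ and $\bar{R}$ is

$$\ell_{d, \bar{r}}(\theta) = \prod_{t=1}^{T} f(d_t, \bar{r}_t). \qquad (12)$$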
This joint likelihood function can be used to estimate the parameters $\theta$ by MLEs
maximizing this likelihood. However, the likelihood involves numerical
integration with respect to the latent variables $X$. It is difficult to accurately
compute these integrals, especially if the likelihood is used within a numerical
maximization procedure. A straightforward and problem-free alternative
is to take the Bayesian approach and treat $X$ in the same way as other
parameters, and formulate the problem in terms of the likelihood conditional on
$\gamma = (\theta, X)$. Then the required conditional likelihood is easily calculated as

$$\ell_{d, \bar{r}}(\gamma) = \prod_{t=1}^{T} f(d_t \mid x_t, \theta) f(\bar{r}_t \mid d_t, x_t, \theta), \qquad (13)$$

avoiding integration with respect to $X$. Estimation based on this likelihood
will be discussed in detail in Section 5.
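
Because the conditional likelihood (13) is a product of binomial and normal terms, its logarithm is straightforward to code. The Python sketch below is one possible implementation, not the authors' code: the function names and the argument layout (lists of yearly values of $x_t$, $d_t$, $J_t$ and the average recovery) are assumptions made for illustration, and a year with $d_t = 0$ is assumed to contribute only the binomial term because no recovery is observed.

```python
import math
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution


def log_binomial_pmf(d, n, prob):
    """Log of the binomial probability in (10); requires 0 < prob < 1."""
    return (math.lgamma(n + 1) - math.lgamma(d + 1) - math.lgamma(n - d + 1)
            + d * math.log(prob) + (n - d) * math.log1p(-prob))


def log_normal_pdf(value, mean, sd):
    """Log of the normal density in (11)."""
    return -0.5 * math.log(2.0 * math.pi) - math.log(sd) - 0.5 * ((value - mean) / sd) ** 2


def conditional_log_likelihood(theta, x, defaults, firms, avg_recovery):
    """Log of (13) given theta = (p, rho, mu, sigma, omega) and factor values x."""
    p, rho, mu, sigma, omega = theta
    threshold = PHI.inv_cdf(p)
    total = 0.0
    for x_t, d_t, j_t, r_t in zip(x, defaults, firms, avg_recovery):
        # conditional default probability Lambda(x_t), eq. (7)
        lam = PHI.cdf((threshold - math.sqrt(rho) * x_t) / math.sqrt(1.0 - rho))
        total += log_binomial_pmf(d_t, j_t, lam)
        if d_t > 0:  # assumption: no recovery term when there are no defaults
            mean_r = mu + sigma * math.sqrt(omega) * x_t
            sd_r = sigma * math.sqrt(1.0 - omega) / math.sqrt(d_t)
            total += log_normal_pdf(r_t, mean_r, sd_r)
    return total
```

Working on the log scale avoids numerical underflow when the yearly terms are multiplied, and the same function can serve as the data term of the log-posterior in an MCMC sampler.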

4.2 Approximate Likelihood and Closed-Form MLEs 

Assuming a large number of firms in the portfolio, some approximation can
be justified to find MLEs for the likelihood (12). We adopt the approach from
[2], estimating the default process parameters $\theta_D = (p, \rho)$ and the systematic
factor $X$ first, and then fitting the recovery parameters $\theta_R = (\mu, \sigma, \omega)$.

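Given $X_t$, the conditional default probability $\Lambda_t = \Lambda(X_t)$ is a monotonic
function of $X_t$; see (7). The density of $X_t$ is the standard normal, thus the
change of probability measure gives the density for $\Lambda_t$ at $\Lambda_t = \lambda_t$:

$$f(\lambda_t \mid \theta_D) = \sqrt{\frac{1 - \rho}{\rho}} \exp\left(-\frac{x_t^2}{2} + \frac{\delta_t^2}{2}\right), \qquad (14)$$

where $x_t$ is the function of $\lambda_t$, the inverse of (7),

$$x_t = \left(\Phi^{-1}(p) - \sqrt{1 - \rho}\, \delta_t\right)/\sqrt{\rho}. \qquad (15)$$

Here, $\delta_t = \Phi^{-1}(\lambda_t)$. For year $t$ we observe the default rate $\Psi_t$ that for $J_t \to \infty$
approaches $\Lambda_t$. Therefore, the likelihood for observed default rates $\psi =
(\psi_1, \ldots, \psi_T)$ is

$$\ell_{\psi}(\theta_D) = \prod_{t=1}^{T} f(\psi_t \mid \theta_D) \qquad (16)$$

with $\delta_t = \Phi^{-1}(\psi_t)$. Maximizing (16) gives the following MLEs for $p$ and $\rho$:

$$\hat{p} = \Phi\left(\frac{\bar{\delta}}{\sqrt{1 + \sigma_\delta^2}}\right), \quad \hat{\rho} = \frac{\sigma_\delta^2}{1 + \sigma_\delta^2}, \qquad (17)$$

where $\bar{\delta} = \sum_{t=1}^{T} \delta_t / T$ and $\sigma_\delta^2 = \sum_{t=1}^{T} (\delta_t - \bar{\delta})^2 / T$. The factor $X_t$ is then
estimated using (15) with default parameters $(p, \rho)$ replaced by MLEs as

$$\hat{x}_t = \left(\Phi^{-1}(\hat{p}) - \sqrt{1 - \hat{\rho}}\, \delta_t\right)/\sqrt{\hat{\rho}}. \qquad (18)$$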
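
One way to see why the default-rate likelihood can be maximized in closed form, sketched here as a reading aid rather than as the authors' own derivation: by the conditional default probability formula, the transformed quantity $\delta_t = \Phi^{-1}(\Lambda_t)$ is linear in $X_t$ and hence normally distributed, so the usual normal-sample MLEs together with the invariance of MLEs under re-parameterization apply. The auxiliary symbols $m$ and $v$ are introduced only for this sketch.

```latex
\begin{align*}
\delta_t &= \frac{\Phi^{-1}(p) - \sqrt{\rho}\,X_t}{\sqrt{1-\rho}}
  \sim N(m, v), \qquad
  m = \frac{\Phi^{-1}(p)}{\sqrt{1-\rho}}, \quad v = \frac{\rho}{1-\rho}, \\
\hat m &= \bar\delta = \frac{1}{T}\sum_{t=1}^{T}\delta_t, \qquad
\hat v = \sigma_\delta^2 = \frac{1}{T}\sum_{t=1}^{T}(\delta_t - \bar\delta)^2, \\
\hat\rho &= \frac{\hat v}{1+\hat v}, \qquad
\Phi^{-1}(\hat p) = \hat m\,\sqrt{1-\hat\rho} = \frac{\bar\delta}{\sqrt{1+\sigma_\delta^2}}, \\
\hat x_t &= \frac{\Phi^{-1}(\hat p) - \sqrt{1-\hat\rho}\,\delta_t}{\sqrt{\hat\rho}}
  = \frac{\bar\delta - \delta_t}{\sigma_\delta}.
\end{align*}
```

The last line shows that the estimated factor values are simply the standardized (and sign-reversed) $\delta_t$, so they have sample mean zero and sample variance one by construction.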



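Given $X_t$ and $D_t$, the average recovery rate $\bar{R}_t$ is from $N(\mu_R, \sigma_R^2)$ with mean
$\mu_R = \mu + \sigma\sqrt{\omega}\, X_t$ and variance $\sigma_R^2 = \sigma^2(1 - \omega)/d_t$. Thus the likelihood for
$T$ observations of the average recovery rate $\bar{r} = (\bar{r}_1, \ldots, \bar{r}_T)$ is

$$\ell_{\bar{r}}(\theta_R \mid x) = \prod_{t=1}^{T} \frac{1}{\sqrt{2\pi}\, \sigma_R} \exp\left(-\frac{(\bar{r}_t - \mu_R)^2}{2\sigma_R^2}\right). \qquad (19)$$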



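Düllmann and Trapp [2] estimate $\theta_R$ by MLEs via maximization of (19) with
respect to $\theta_R$, where $x_t$ is replaced with $\hat{x}_t$. Due to numerical difficulties with
maximization, they estimate $\sigma$ by the historical volatility of the recovery
rate $\bar{r}_t$. However, re-parameterizing with $\sigma_1 = \sigma\sqrt{\omega}$ and $\sigma_2 = \sigma\sqrt{1 - \omega}$, we
derive the following closed-form solutions for MLEs of $(\mu, \sigma, \omega)$:

$$\hat{\sigma}_1 = \frac{\sum_t d_t \sum_t d_t \hat{x}_t \bar{r}_t - \sum_t d_t \hat{x}_t \sum_t d_t \bar{r}_t}{\sum_t d_t \sum_t d_t \hat{x}_t^2 - \left(\sum_t d_t \hat{x}_t\right)^2}, \qquad (20)$$

$$\hat{\mu} = \frac{\sum_t d_t (\bar{r}_t - \hat{\sigma}_1 \hat{x}_t)}{\sum_t d_t}, \quad \hat{\sigma}_2^2 = \frac{1}{T} \sum_{t=1}^{T} d_t (\bar{r}_t - \hat{\mu} - \hat{\sigma}_1 \hat{x}_t)^2, \qquad (21)$$

$$\hat{\omega} = \frac{\hat{\sigma}_1^2}{\hat{\sigma}_1^2 + \hat{\sigma}_2^2}, \quad \hat{\sigma} = \sqrt{\hat{\sigma}_1^2 + \hat{\sigma}_2^2}. \qquad (22)$$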
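
The two-stage procedure of this section reduces to a handful of sums. The Python sketch below is an illustration under stated assumptions: the default-stage estimates follow from treating $\delta_t = \Phi^{-1}(\psi_t)$ as a normal sample, and the recovery-stage estimates are written in the weighted least squares form (weights $d_t$) that the normal likelihood of the average recovery rate implies under the re-parameterization $(\sigma_1, \sigma_2)$. The function names are hypothetical, and the constraint $\hat{\sigma}_1 \ge 0$ is not enforced.

```python
import math
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution


def fit_default_parameters(default_rates):
    """Closed-form estimates of p, rho and of the yearly systematic factor."""
    deltas = [PHI.inv_cdf(psi) for psi in default_rates]
    n_years = len(deltas)
    mean = sum(deltas) / n_years
    var = sum((d - mean) ** 2 for d in deltas) / n_years
    rho_hat = var / (1.0 + var)
    threshold_hat = mean / math.sqrt(1.0 + var)        # estimate of Phi^{-1}(p)
    p_hat = PHI.cdf(threshold_hat)
    x_hat = [(threshold_hat - math.sqrt(1.0 - rho_hat) * d) / math.sqrt(rho_hat)
             for d in deltas]
    return p_hat, rho_hat, x_hat


def fit_recovery_parameters(avg_recovery, defaults, x_hat):
    """Weighted least squares estimates of (mu, sigma, omega), weights d_t."""
    rows = list(zip(defaults, x_hat, avg_recovery))
    s_d = sum(d for d, x, r in rows)
    s_dx = sum(d * x for d, x, r in rows)
    s_dr = sum(d * r for d, x, r in rows)
    s_dxx = sum(d * x * x for d, x, r in rows)
    s_dxr = sum(d * x * r for d, x, r in rows)
    sigma1 = (s_d * s_dxr - s_dx * s_dr) / (s_d * s_dxx - s_dx ** 2)
    mu_hat = (s_dr - sigma1 * s_dx) / s_d
    sigma2_sq = sum(d * (r - mu_hat - sigma1 * x) ** 2 for d, x, r in rows) / len(rows)
    total = sigma1 ** 2 + sigma2_sq
    return mu_hat, math.sqrt(total), sigma1 ** 2 / total   # (mu, sigma, omega)
```

The inputs would be the observed yearly default rates, default counts and average recovery rates; the output of the first function feeds the second.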



5 Bayesian Inference and MCMC 

The parameters $\theta$ are unknown and it is important to account for this
uncertainty when the capital is estimated. A standard frequentist approach
to estimate this uncertainty is based on limiting results of normally
distributed MLEs for large datasets. We take the Bayesian approach, because
the dataset is small and the parameter uncertainty distribution is very different from
normal. From a Bayesian perspective, both the parameters $\theta$ and the latent factor
$X$ are random variables. Given a prior density $\pi(\gamma)$ and a data likelihood
$\pi(y \mid \gamma) = \ell_y(\gamma)$, where $\gamma = (\theta, X)$ and $Y$ is the data vector, the density of $\gamma$
conditional on $Y = y$ (posterior density) is determined by the Bayes theorem

$$\pi(\gamma \mid y) \propto \pi(y \mid \gamma) \pi(\gamma). \qquad (23)$$
The posterior can then be used for predictive inference and analysis of the
uncertainties. There are many useful texts on Bayesian inference; e.g. see 
[TO] ; for recent examples in operational risk and insurance, see [T4"| [HI [7j. 

The explicit evaluation of the posterior (23) is often difficult and one can use
the MCMC method to sample from the posterior. In particular, MCMC allows us
to get samples of $\theta$ and $X$ from the joint posterior $\pi(\theta, X \mid y)$. Then, taking
samples of $\theta$ marginally, we can get the posterior for model parameters
$\pi(\theta \mid y)$, i.e. effectively integrating out the latent factor $X$. Similarly, taking
samples of $X$ marginally, we get the posterior for the systematic factor $\pi(X_t \mid y)$.
The posterior mean is a commonly used point estimate. We adopt the component-wise
Metropolis-Hastings algorithm for sampling from the posterior $\pi(\gamma \mid y)$, following
the same procedure as in [15], [8]. Other MCMC methods such as the univariate
slice sampler utilized in [7] can also be used. For numerical efficiency,
we work with the parameter $\Phi^{-1}(p)$. Also, we assume a uniform prior for all
parameters and the standard normal distribution as the prior for $X_1, \ldots, X_T$.
The only subjective judgement we bring to the prior is the lower and upper
bounds of the parameter values

$$\Phi^{-1}(p) \in (-10, 10), \quad \rho \in (0, 1), \quad \mu \in (0, 1), \quad \sigma \in (0.01, 1.0), \quad \omega \in (0, 1).$$

The parameter support range should be sufficiently large so that the posterior
is implied mainly by the observed data. We checked that an increase in
parameter bounds did not lead to material difference in results.

The starting value of the chain for the $k$th component is set to a uniform
random number drawn independently from the support $(a_k, b_k)$. In the
single-component Metropolis-Hastings algorithm, we adopt a Gaussian
density (truncated below $a_k$ and above $b_k$) for the proposal density. For each
component the variance parameter of the proposal was pre-tuned and adjusted
so that the acceptance rate is close to 0.234 (the optimal acceptance rate for
$d$-dimensional target distributions with iid components as shown in [11]). The
chain is run for 100,000 posterior samples (after 20,000 "burn-in" samples).
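
A component-wise Metropolis-Hastings sampler of the kind described above can be sketched as follows. This is a generic illustration, not the authors' code: the function names are hypothetical, `log_target` stands for any user-supplied log-posterior (for this model it would combine the log of the conditional likelihood with the log-priors), and the proposal scales are fixed inputs rather than tuned automatically. Because the Gaussian proposal is truncated to the support, it is not symmetric, so the acceptance ratio includes the ratio of the truncation masses.

```python
import math
import random
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution


def draw_truncated_normal(center, scale, low, high, rng):
    """Gaussian proposal truncated to (low, high), sampled by simple rejection."""
    while True:
        y = rng.gauss(center, scale)
        if low < y < high:
            return y


def log_mass_in_bounds(center, scale, low, high):
    """Log of the Gaussian probability mass inside (low, high)."""
    return math.log(PHI.cdf((high - center) / scale) - PHI.cdf((low - center) / scale))


def componentwise_mh(log_target, bounds, scales, n_samples, burn_in, seed=0):
    """Update one component at a time; return kept samples and acceptance rates."""
    rng = random.Random(seed)
    state = [rng.uniform(a, b) for a, b in bounds]   # uniform starting values
    current = log_target(state)
    accepted = [0] * len(bounds)
    samples = []
    n_total = burn_in + n_samples
    for iteration in range(n_total):
        for k, (a, b) in enumerate(bounds):
            proposal = list(state)
            proposal[k] = draw_truncated_normal(state[k], scales[k], a, b, rng)
            candidate = log_target(proposal)
            # Hastings correction: ratio of normalizing masses of the two proposals
            log_ratio = (candidate - current
                         + log_mass_in_bounds(state[k], scales[k], a, b)
                         - log_mass_in_bounds(proposal[k], scales[k], a, b))
            if math.log(1.0 - rng.random()) < log_ratio:
                state, current = proposal, candidate
                accepted[k] += 1
        if iteration >= burn_in:
            samples.append(list(state))
    rates = [count / n_total for count in accepted]
    return samples, rates
```

In practice the returned acceptance rates would be inspected and the entries of `scales` adjusted in pilot runs until the rates are close to the chosen target.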

6 Bayesian Capital Estimates 

As discussed in [13], Bayesian methods are particularly convenient to
quantify parameter uncertainty and its impact on the capital estimate. Under the
Bayesian approach, the full predictive density (accounting for parameter
uncertainty) of the next time period loss $L_{T+1}$, given data $Y = y$, is

$$f_{L_{T+1}}(z \mid y) = \int f_{L_{T+1}}(z \mid \theta) \pi(\theta \mid y)\, d\theta, \qquad (24)$$

assuming that, given $\Theta$, $L_{T+1}$ and $Y$ are independent. Its quantile,

$$Q_q^P = \inf\{z : \Pr[L_{T+1} > z \mid Y] \le 1 - q\}, \qquad (25)$$

can be used as a risk measure for EC. The procedure for simulating $L_{T+1}$
from (24) and calculating $Q_q^P$ is simple: 1) Draw a sample of $\theta$ from the
posterior $\pi(\theta \mid y)$, e.g. using MCMC; 2) Given $\theta$, simulate loss $L$ following
steps 1-4 in Algorithm 1; 3) Repeat steps 1-2 to obtain $N$ samples of $L$; 4)
Estimate $Q_q^P$ using samples of $L$ in the standard way.

Another approach under a Bayesian framework to account for parameter
uncertainty is to consider a quantile $Q_q(\Theta)$ of the loss density $f(\cdot \mid \Theta)$,

$$Q_q(\Theta) = \inf\{z : \Pr[L_{T+1} > z \mid \Theta] \le 1 - q\}. \qquad (26)$$

Given that $\Theta$ is distributed as $\pi(\theta \mid y)$, one can find the associated
distribution of $Q_q(\Theta)$, form a predictive interval to contain the true quantile value
with some probability and argue that the conservative estimate of the capital
accounting for parameter uncertainty should be based on the upper bound
of the interval. However, it might be difficult to justify the choice of the
confidence level for the interval. The procedure to obtain the posterior
distribution of the quantile $Q_q(\Theta)$ is simple: 1) Draw a sample of $\Theta$ from the posterior
$\pi(\theta \mid y)$, e.g. using MCMC; 2) Compute $Q_q = Q_q(\Theta)$ using e.g. Algorithm 1;
3) Repeat steps 1-2 to obtain $N$ samples of $Q_q(\Theta)$. For the limiting case of a large
number of borrowers, Step 2 can be approximated by a closed-form formula.

The extra loading for EC due to parameter uncertainty can be formally
defined as the difference between the quantile of the full predictive
distribution accounting for parameter uncertainty, $Q_{0.999}^P$, and the posterior mean of
$Q_{0.999}(\Theta)$, i.e. $Q_{0.999}^P - E[Q_{0.999}(\Theta)]$.
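
For the limiting case of a large number of borrowers, both quantities entering the extra loading can be computed from posterior draws of the parameters. The Python sketch below is illustrative only: the function names are hypothetical, `posterior_draws` is assumed to be a list of tuples $(p, \rho, \mu, \sigma, \omega)$ produced by some MCMC run, and the limiting loss is taken as the product of the conditional default probability and the conditional expected loss rate for normally distributed recovery.

```python
import math
import random
from statistics import NormalDist, fmean

PHI = NormalDist()  # standard normal distribution


def limiting_loss(x, theta):
    """Limiting loss rate Lambda(x) * S(x) for theta = (p, rho, mu, sigma, omega)."""
    p, rho, mu, sigma, omega = theta
    pd = PHI.cdf((PHI.inv_cdf(p) - math.sqrt(rho) * x) / math.sqrt(1.0 - rho))
    a = 1.0 - mu - sigma * math.sqrt(omega) * x
    b = sigma * math.sqrt(1.0 - omega)
    return pd * (a * PHI.cdf(a / b) + b * PHI.pdf(a / b))


def empirical_quantile(values, q):
    """Smallest sample value whose empirical distribution function is >= q."""
    ordered = sorted(values)
    return ordered[max(0, math.ceil(q * len(ordered)) - 1)]


def extra_loading(posterior_draws, q=0.999, n_samples=50000, seed=0):
    """Predictive quantile, posterior mean of the quantile, and their difference."""
    rng = random.Random(seed)
    # Full predictive loss: mix over a posterior draw of theta and X ~ N(0, 1).
    losses = [limiting_loss(rng.gauss(0.0, 1.0), rng.choice(posterior_draws))
              for _ in range(n_samples)]
    predictive_q = empirical_quantile(losses, q)
    # Quantile given theta in closed form: the limiting loss at X = Phi^{-1}(1 - q).
    x_q = PHI.inv_cdf(1.0 - q)
    mean_q = fmean(limiting_loss(x_q, theta) for theta in posterior_draws)
    return predictive_q, mean_q, predictive_q - mean_q
```

With a degenerate posterior (a single parameter value) the loading should vanish up to Monte Carlo error, which gives a simple sanity check before the function is applied to real MCMC output.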

7 Results using Moody's data 

Using historical data for the overall corporate default and recovery rates
over 1982-2010 from Moody's report [6], we fit the model using MCMC and
MLEs. Table 1 shows the posterior summary and MLEs for the model parameters
(the coefficient of variation, CV, is defined as the ratio of standard deviation
to the mean). Significant kurtosis and positive skewness in most parameters
indicate that a Gaussian approximation for parameter uncertainties is not
appropriate. Also, all MLEs are within one standard deviation from the posterior
mean. The posterior mean of the systematic factor $X_t$ for 2009 is about
-2.27, which corresponds to approximately the 99% quantile level of the
diversified portfolio. This maximum negative systematic factor for 2009 is the
consequence of the disastrous 2008 when the bankruptcy of Lehman Brothers
occurred. Comparison of the MLE and posterior mean for the latent factor $X$ is
shown in Figure 1.

Figure 1: MLE (dots) and posterior mean (solid line) of systematic factor
$X_t$. Error bars correspond to posterior standard deviation of $X_t$.



Table 1: MLE and MCMC posterior statistics of the model parameters.

item       MLE      Mode     Mean     Stdev    Skewness  Kurtosis  CV
$p$        0.0167   0.0177   0.0179   0.0028   0.812     4.62      0.154
$\rho$     0.0635   0.141    0.0815   0.024    1.01      4.35      0.286
$\mu$      0.411    0.439    0.414    0.022    0.309     3.19      0.055
$\omega$   0.0192   0.0717   0.031    0.016    1.24      5.39      0.51
$\sigma$   0.499    0.449    0.502    0.070    0.588     3.63      0.140

The MCMC predictions on stressed PD, LGD and EC in comparison with
corresponding MLEs are shown in Table 2. The posterior mean of EC is 35% higher
than the MLE, the posterior median is 24% higher, and the 0.75 quantile of the
posterior for EC is more than 50% higher. The uncertainty in
the posterior of EC is large, with a CV of about 34.5%; also note the large difference
between the 0.75 and the 0.25 quantiles of the EC posterior. Underestimation
of EC by the MLE in comparison with posterior estimates is significant due to
large parameter uncertainty and large skewness in the EC posterior. Also, we
get the following results for the 0.999 quantile $Q_{0.999}^P$ of the full predictive
loss density for portfolios with different numbers of borrowers $J$: $Q_{0.999}^P =
(0.1454, 0.1092, 0.1026, 0.1026)$ for $J = (50, 500, 5000, \infty)$ respectively. The
diversification effect when $J$ increases is evident. In particular, $Q_{0.999}^P$ at
$J = 500$ is about 25% lower than at $J = 50$; and for $J = 5000$ it
is virtually the same as for the limiting case $J = \infty$. Note that $Q_{0.999}^P$ at
$J = \infty$ is about 50% larger than the MLE for EC, and about 15% larger than
the posterior mean of $Q_{0.999}^\infty(\Theta)$. The 15% impact of parameter uncertainty
on EC indicates that the 1982-2010 dataset is long enough for a more or
less confident use of the model for capital quantification. Of course, a formal
model validation should be performed before a final conclusion.
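
The percentages quoted in this paragraph can be reproduced from the EC row of Table 2 and the predictive quantiles listed above. The short Python check below uses only those reported numbers; the variable names are hypothetical, and the comparisons with the MLE are expressed as the excess of each posterior statistic over the MLE.

```python
ec_mle = 0.0657                      # Table 2, EC row, MLE column
ec_posterior = {"mean": 0.0888, "median": 0.0814, "0.75 quantile": 0.101}

# Relative excess of each posterior statistic of EC over the MLE
excess_over_mle = {name: value / ec_mle - 1.0 for name, value in ec_posterior.items()}

# 0.999 quantiles of the full predictive loss density by number of borrowers J
q_predictive = {50: 0.1454, 500: 0.1092, 5000: 0.1026, float("inf"): 0.1026}

diversification_gain = 1.0 - q_predictive[500] / q_predictive[50]
loading_vs_mle = q_predictive[float("inf")] / ec_mle - 1.0
loading_vs_posterior_mean = q_predictive[float("inf")] / ec_posterior["mean"] - 1.0

for name, value in excess_over_mle.items():
    print(f"posterior {name} of EC exceeds the MLE by {value:.1%}")
print(f"J = 500 versus J = 50: {diversification_gain:.1%} lower")
print(f"predictive quantile versus MLE: {loading_vs_mle:.1%} higher")
print(f"predictive quantile versus posterior mean: {loading_vs_posterior_mean:.1%} higher")
```

The ratios come out at roughly 35%, 24% and 54% for the mean, median and 0.75 quantile, about 25% for the diversification effect, and about 56% and 16% for the predictive quantile relative to the MLE and to the posterior mean.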

Table 2: MLE and MCMC posterior statistics for PD, LGD and EC.

item   MLE      Mean     Stdev    0.25Q    0.5Q     0.75Q    CV
PD     0.0819   0.103    0.029    0.0825   0.0968   0.116    0.288
LGD    0.803    0.847    0.0562   0.808    0.841    0.880    0.066
EC     0.0657   0.0888   0.031    0.0672   0.0814   0.101    0.345


8 Conclusion 

The presented methodology allows joint estimation of the model parameters and
the latent systematic risk factor in the well-known LGD model via the Bayesian
approach and the MCMC method. This approach allows an easy calculation of the
full predictive loss density $f_{L_{T+1}}(\cdot \mid y)$ accounting for parameter uncertainty;
then the economic capital can be based on the high quantile of this
distribution. Given the small datasets typically used to fit the model, the parameter
uncertainty is large and the posterior is very different from the normal
distribution, indicating that a Gaussian approximation for parameter uncertainties
(typically used under the frequentist maximum likelihood approach assuming
the large sample limit) is not appropriate.

Due to data limitations, we assumed a homogeneous portfolio and thus the
results should be treated as an illustration. However, the results demonstrate
that the extra capital to cover parameter uncertainty can be significant and
should not be disregarded by practitioners developing LGD models. The
approach can be extended to deal with non-homogeneous portfolios, more
than one latent factor and mean reversion in the systematic factor. It should
not be difficult to incorporate macroeconomic factors as in [12].